Text and Web Mining Approaches in Order to Build Specialized Ontologies

Authors

  • Mathieu Roche
  • Yves Kodratoff
Abstract

This paper presents a text-mining approach for extracting candidate terms from a corpus. The relevant candidates are then selected using a web-mining approach. The resulting terms (i.e. the relevant candidate terms) are the instances of the specialized ontologies built during this process. The experiments use real data, a Human Resources corpus, and show the quality of our text- and web-mining approaches.

Similar resources

Text mining tool for ontology engineering based on use of product taxonomy and web directory

This paper presents our attempt to build a text mining tool for collecting specific words (verbs in our case) that usually occur together with a particular product category, as support for ontology designers. As ontologies are a cornerstone of the success of the semantic web, our effort is focused on building small and specialized ontologies concerning one product category and describing its fr...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Product taxonomy and web directory as support for ontology engineers

This paper presents our attempt to build a text mining tool for collecting specific words (verbs in our case) that usually occur together with a particular product category. These verbs could be used as support for ontology designers in creating relations between entities and concepts. As ontologies are a cornerstone of the success of the semantic web, our effort is focused on building small...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Aisles through the Category Forest - Utilising the Wikipedia Category System for Corpus Building in Machine Learning

The World Wide Web is a continuous challenge to machine learning. Established approaches have to be enhanced and new methods developed in order to tackle the problem of finding and organising relevant information. It has often been argued that semantic classifications of input documents help solve this task. But while approaches of supervised text categorisation perform quite well on gen...

Journal:
  • J. Digit. Inf.

Volume 10

Publication date: 2009